Introduction to Data

Spring 2026

What the **** are data?

Data are a means to represent the world

Context Matters

Semantics

  • Row = observation, record, case, example, instance, pattern, sample
  • Column = variable, field, feature, attribute, input, predictor, dimension
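
To make the row/column vocabulary concrete, here is a minimal sketch in R. The `people` data frame and its values are hypothetical and serve only to illustrate the idea.

```r
# Hypothetical toy data set: each row is one observation (a person),
# each column is one variable measured on that person.
people <- data.frame(
  name   = c("Ana", "Ben", "Cy"),
  height = c(162.5, 180.1, 175.0),
  smoker = c(FALSE, TRUE, FALSE)
)

nrow(people)    # number of observations (rows)
ncol(people)    # number of variables (columns)
people[2, ]     # one observation: every variable for the second person
people$height   # one variable: the height of every person
```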

Categorical Data

  • Nominal: Unordered category
  • Ordinal: Ordered category
  • Both can be binary or multinomial
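
One way to express the nominal/ordinal distinction in R is with `factor()`. This is a sketch with made-up values; the variable names `eye_color` and `shirt_size` are illustrative assumptions.

```r
# Nominal: categories with no inherent order (hypothetical example values)
eye_color <- factor(c("brown", "blue", "green", "brown"))

# Ordinal: categories with a meaningful order, declared via `levels` and `ordered`
shirt_size <- factor(c("M", "S", "L", "M"),
                     levels = c("S", "M", "L"),
                     ordered = TRUE)

is.ordered(eye_color)    # FALSE
is.ordered(shirt_size)   # TRUE
shirt_size < "L"         # order comparisons work for ordered factors
nlevels(eye_color)       # 3 levels, so multinomial; a binary variable has 2
```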

Numeric Data

  • Continuous: Can take on any value within a range
    • Interval: Distances between values are equal and meaningful
      • Numbers are ‘arbitrary’ and lack a true 0 point
      • IQ, temperature in °C or °F, etc.
    • Ratio: Has a true 0 point. Cannot fall below 0, so ratios between values are meaningful.
  • Discrete: Can only take on certain numbers. There are ‘gaps’ between numbers.
    • Counts & Integers (whole numbers)
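
A short worked example may help separate interval from ratio scales. The temperatures and the `visits` counts below are hypothetical values chosen for illustration.

```r
# Interval scale: Celsius has an arbitrary zero, so ratios are not meaningful.
temp_c <- c(10, 20)
temp_c[2] - temp_c[1]   # a difference of 10 degrees is meaningful
temp_c[2] / temp_c[1]   # 2, but 20 degrees C is not "twice as hot" as 10

# Ratio scale: Kelvin has a true zero, so ratios are meaningful.
temp_k <- temp_c + 273.15
temp_k[2] / temp_k[1]   # roughly 1.035

# Discrete: counts move in whole steps, with gaps between possible values.
visits <- c(0L, 3L, 1L)
is.integer(visits)      # TRUE
```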

A Note about Research Design

  • Qualitative Research: Uses descriptive statements to seek answers
  • Quantitative Research: Uses measurements to seek answers from qualitative or quantitative data
    • Data Science
  • Less precise: Qualitative / categorical
  • More precise: Quantitative / continuous

Data Types & R

Common Data Types

Type       Definition                                Example
Double     Whole or floating-point number            5 or 5.73
Integer    Whole number (written with an L suffix)   5L, 2L, 3L
Character  Single characters or strings of text      "c", "cat", "cat in the hat"
Factor     Categorical or discrete variables         M/F, S/M/L
Boolean    Binary categories (R type: logical)       TRUE/FALSE
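
The types in the table can be checked directly. The sketch below uses base R's `typeof()` and `class()`, which are not mentioned on the slide but report how R stores and labels a value.

```r
typeof(5)       # "double": plain numbers are doubles by default
typeof(5.73)    # "double"
typeof(5L)      # "integer": the L suffix requests an integer
typeof("cat")   # "character"
typeof(TRUE)    # "logical"

sizes <- factor(c("S", "M", "L"))
class(sizes)    # "factor"
```

Note that a bare `5` is a double even though it looks like a whole number; only `5L` is stored as an integer.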

Data Types in R

Numbers

[1]  4.12  4.57  5.00 17.00

Characters

[1] "M"       "male"    "F"       "cat"     "Cat-Dog"

Factors

[1] M M F M
Levels: F M

Boolean

[1]  TRUE FALSE  TRUE FALSE
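
The slide shows only printed output. Vectors like those above could be created as follows; the variable names (`numbers`, `chars`, `sex`, `flags`) are assumptions, since the original code is not shown.

```r
# Numbers: a double vector
numbers <- c(4.12, 4.57, 5, 17)
numbers

# Characters: a character vector
chars <- c("M", "male", "F", "cat", "Cat-Dog")
chars

# Factors: levels are sorted alphabetically by default, hence "Levels: F M"
sex <- factor(c("M", "M", "F", "M"))
sex

# Boolean: a logical vector
flags <- c(TRUE, FALSE, TRUE, FALSE)
flags
```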

Special Data Types




NULL
[1] NA
[1] NaN
[1] Inf
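
The four special values above behave differently. The following sketch shows where each typically comes from and how to test for it with base R functions; the variable names are illustrative.

```r
empty <- NULL            # NULL: the absence of an object
length(empty)            # 0

missing_value <- NA      # NA: a missing value that still occupies a slot
length(missing_value)    # 1
is.na(missing_value)     # TRUE

0 / 0                    # NaN: "not a number", an undefined result
is.nan(0 / 0)            # TRUE

1 / 0                    # Inf: positive infinity
is.finite(1 / 0)         # FALSE

mean(c(1, NA, 3))                 # NA: missing values propagate
mean(c(1, NA, 3), na.rm = TRUE)   # 2
```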

Data Modes

Each variable / object has a data mode, an umbrella category that groups related data types.

  • Numeric: Both integers and doubles; includes factors
  • Character: Characters and strings
  • Logical: Boolean TRUE and FALSE

The mode() function returns the data mode of an object.

mode(42)
#> [1] "numeric"
a <- "beer"
mode(a)
#> [1] "character"
mode(TRUE)             # prefer TRUE over T, which can be reassigned
#> [1] "logical"
mode(as.factor("M"))   # factors are stored as integer codes
#> [1] "numeric"
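
The last result can be surprising. A factor reports mode "numeric" because R stores it as integer codes plus a set of labels. The sketch below, using a hypothetical `size` factor, compares `mode()` with base R's `typeof()` and `class()`.

```r
size <- factor(c("S", "M", "L", "M"))

mode(size)         # "numeric"
typeof(size)       # "integer": the underlying codes
class(size)        # "factor"
levels(size)       # "L" "M" "S": sorted alphabetically by default
as.integer(size)   # 3 2 1 2: each value's position in levels(size)
is.numeric(size)   # FALSE: despite the mode, factors do not count as numeric here
```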

Checking and Converting

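is.numeric(2)
#> [1] TRUE
is.numeric(a)       # a is the character string "beer" defined earlier
#> [1] FALSE
is.character("a")   # the literal string "a", not the variable a
#> [1] TRUE
as.character(4)
#> [1] "4"
as.numeric("4")     # converts the string "4" to the number 4
#> [1] 4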
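
Conversions do not always succeed, and factors need extra care. The following sketch shows a few common cases with made-up values; `f` is a hypothetical factor.

```r
as.numeric("4.5")                       # 4.5
suppressWarnings(as.numeric("cat"))     # NA: text that is not a number cannot convert
as.logical(c("TRUE", "false", "yes"))   # TRUE FALSE NA

# Factor pitfall: as.numeric() returns the internal codes, not the labels.
f <- factor(c("10", "20", "10"))
as.numeric(f)                  # 1 2 1
as.numeric(as.character(f))    # 10 20 10
```

Converting a factor through `as.character()` first recovers the values shown in its labels. Without `suppressWarnings()`, the failed conversion of "cat" also prints a coercion warning.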